- Introductions
- Class overview
- Getting R up and running
Photo by Belinda Fewings on Unsplash
Carrie Wright
Assistant Scientist, Department of Biostatistics, JHSPH
PhD in Biomedical Sciences
Email: cwrigh60@jhu.edu
Website: carriewright11.github.io
Ava Hoffman
Research Associate, Department of Biostatistics, JHSPH
PhD in Ecology
Email: ava.hoffman@jhu.edu
Website: avahoffman.com
Candace Savonen
Research Associate, Department of Biostatistics, JHSPH
Master’s in Neuroscience
Former Data Analyst for Childhood Cancer Data Lab
Email: csavone1@jhu.edu
Website: https://www.cansavvy.com/

Grant Schumock
PhD Candidate, Department of Biostatistics, JHSPH
BS in Nuclear Engineering
Email: gschumo1@jhmi.edu
Qier Meng
ScM Student, Department of Biostatistics, JHSPH
Bachelor’s Degree in Mathematics
Bachelor’s Degree in Neuroscience
Email: qmeng11@jhmi.edu
R is a language and environment for statistical computing and graphics
R is an open source implementation of the S language, which was developed at Bell Laboratories in the 1970s.
The aim of the S language, as expressed by John Chambers, is “to turn ideas into software, quickly and faithfully”
(source: http://www.r-project.org/, https://en.wikipedia.org/wiki/S_(programming_language), https://en.wikipedia.org/wiki/Bell_Labs)
In 1991 Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand began developing R
R is named partly after the first names of its first two authors and partly as a play on the name of S.
R is both open source and open development

(source: http://www.r-project.org/, https://en.wikipedia.org/wiki/R_(programming_language))
High level language designed for statistical computing
Powerful and flexible - especially for data wrangling and visualization
Free (open source)
Extensive add-on software (packages)
Strong community
(source: https://rladies-baltimore.github.io/)
Fairly steep learning curve
“Programming” oriented
Minimal interface
Little centralized support, relies on online community and package developers
Annoying to update
Slower, and more memory intensive, than the more traditional programming languages (C, Java, Perl, Python)
What do you hope to get out of the class?
Why do you want to use R?
Photo by Nick Fewings on Unsplash
http://jhudatascience.org/intro_to_r
Materials will be uploaded the night before class
## Course Format
## CoursePlus
CoursePlus: https://courseplus.jhu.edu/core/index.cfm/go/syl:syl.public.view/coid/16733/
There will be surveys throughout the class to give feedback to the instructors.
End-of-class survey: link in email.
Homework and the Final Project are due by Wednesday, Jan 26, 2022 at 11:59pm EST.
If you turn homework in early, we may be able to give you feedback sooner.
Note: Only people taking the course for credit must turn in the assignments. However, we will evaluate all submitted assignments in case others would like feedback on their work.
Install the latest version from: http://cran.r-project.org/
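Once R is installed, a quick way to confirm which version you have is to ask R itself. This is only a small sketch using base R; the `4.0.0` threshold is an illustrative assumption, not a course requirement.

```r
# Print the version of R that is currently running
R.version.string

# Compare against a version number
# (4.0.0 is only an illustrative threshold, not a course requirement)
getRversion() >= "4.0.0"
```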
RStudio is an integrated development environment (IDE) that makes it easier to work with R.
More on that soon!
This course will involve moving files around on your computer and downloading files.
If you are new to this, check out these videos:
If you have a PC: https://youtu.be/we6vwB7DsNU
If you have a Mac: https://www.youtube.com/watch?v=Ao9e0cDzMrE
Packages are roughly analogous to a software application, like Microsoft Word, on your computer. Just as your operating system lets you run Word, having R installed (along with any other required packages) lets you use a package.
A function might help you add numbers together, create a plot, or organize your data. More on that soon!
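To make the idea of a function concrete, here is a minimal sketch that uses only functions that come with R. The variable names (`total`, `average`, `ordered`) are made up for illustration.

```r
# sum() adds numbers together
total <- sum(1, 2, 3)
total

# mean() averages the values in a vector built with c()
average <- mean(c(2, 4, 6))
average

# sort() organizes values from smallest to largest
ordered <- sort(c(3, 1, 2))
ordered
```

Each line follows the same pattern: a function name, then its inputs inside parentheses.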
We will mostly show you how to use tidyverse packages and functions.
This is a newer set of packages that can make your code more intuitive or readable.
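As a rough illustration of what "more readable" can mean, the sketch below computes the same average two ways, using the `mtcars` dataset that ships with R. It assumes the `dplyr` package (part of the tidyverse) is already installed, and skips the tidyverse half if it is not. The names `base_result` and `tidy_result` are made up for this example.

```r
# Base R: subset rows with [ ] and then average one column
base_result <- mean(mtcars$mpg[mtcars$cyl == 4])
base_result

# Tidyverse (dplyr): the same steps, read top to bottom
# (runs only if dplyr is installed)
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  tidy_result <- mtcars %>%
    filter(cyl == 4) %>%               # keep cars with 4 cylinders
    summarize(mean_mpg = mean(mpg))    # average their miles per gallon
  print(tidy_result)
}
```

Both versions give the same number; the tidyverse version spells out each step as a verb.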
We have an R package called jhur that will make sure all the packages are installed.
You can just copy and paste the code below into your console. We’ll explain what it all means in the next day or two.
```r
install.packages("remotes")
remotes::install_github("muschellij2/jhur")
```
Note it may take ~5-10 minutes to run.
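If you want to check whether the installation worked, one possible approach is sketched below. `is_installed()` is a hypothetical helper written for this example, not part of R or `jhur`; it wraps the base R function `requireNamespace()`.

```r
# Hypothetical helper: TRUE if the named package can be loaded, FALSE otherwise
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this should be TRUE on any installation
is_installed("stats")

# TRUE only if the install commands above finished successfully
is_installed("remotes")
is_installed("jhur")
```

If either of the last two lines returns `FALSE`, try running the install commands again and read any error messages.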
Want more?
- Tidyverse Skills for Data Science Book: https://jhudatascience.org/tidyversecourse/
- Tidyverse Skills for Data Science Course (can get certificate): https://www.coursera.org/specializations/tidyverse-data-science-r
- R for Data Science: http://r4ds.had.co.nz/
- Open Case Studies: https://www.opencasestudies.org/
- Dataquest: https://www.dataquest.io/
Need help?
- Various “Cheat Sheets”: https://www.rstudio.com/resources/cheatsheets/
- R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf
- R terminology: https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf
Interested in Reproducibility? Check out Candace’s courses at: https://jhudatascience.org/Reproducibility_in_Cancer_Informatics/ and https://jhudatascience.org/Adv_Reproducibility_in_Cancer_Informatics/